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1  SUMMARY 


The  goal  of  this  project  is  to  develop  a  new  methodology  for  seismic  multiple-event  location  which 
can  improve  the  quality  of  ground- truth  event  databases.  The  improvements  sought  include  more 
complete  and  rigorous  quantification  of  location  uncertainty,  better  quality  control  on  phase  picks 
associated  with  ground-truth  events,  and  a  greater  quantity  of  events  that  meet  specified  ground- 
truth  criteria.  To  achieve  these  improvements  we  are  pursuing  several  enhancements  to  multiple- 
event  location  methodology  applicable  to  earthquake  clusters,  such  as  the  combined  use  of  lo-cal, 
regional  and  teleseismic  arrival-time  data  in  a  unified  analysis,  and  application  of  a  rigorous 
framework  for  uncertainty  analysis  that  incorporates  more  realistic,  and  generally  less  restrictive, 
assumptions  about  unknown  travel-time  corrections  (model  errors)  attributable  to  unknown  veloc¬ 
ity  variations  in  the  Earth. 

In  the  first  year  of  the  project  we  developed  the  mathematical  framework  and  numerical  al¬ 
gorithm  for  much  of  our  more  general  approach  to  multiple-event  location.  We  applied  the  re¬ 
sulting  multiple-event  location  algorithm  to  two  event  clusters:  the  Rogun  earthquake  cluster  in 
Tajikistan,  one  of  the  clusters  in  the  IASPEI  ground-truth  database  processed  with  the  HDC-RCA 
method,  and  57  explosions  in  the  Pahute  Mesa  testing  area  of  Nevada  Test  Site  for  which  ground- 
truth  (GTO)  location  information  is  available.  Our  application  to  the  Rogun  cluster  demonstrated 
that  our  multiple-event  location  approach  emulates  the  HDC-RCA  as  a  special  case  of  its  more 
general  methodology.  The  application  to  Pahute  Mesa  explosions  showed  that  our  method  can 
achieve  epicentral  locations  with  a  median  mislocation  of  1  km  and  maximum  of  3  km,  when 
outlier  suppression  techniques  we  implemented  are  applied.  However,  our  preliminary  tests  of  a 
more  general  stochastic  model  of  travel-time  corrections,  which  uses  a  full  variance-covariance 
matrix  on  the  corrections  as  prior  information,  did  not  reduce  location  errors,  indicating  the  need 
for  further  work  on  this  element  of  our  approach. 
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2  INTRODUCTION 

Ground-truth  seismic  events  play  a  crucial  role  in  the  development  of  improved  event  location 
methods  for  nuclear  monitoring,  both  as  a  source  of  data  for  travel-time  calibration  via  velocity 
tomography  or  empirical  methods,  and  in  the  validation  of  location  and  calibration  procedures. 
While  several  non-seismic  means  for  deriving  accurate  event  locations  exist  (e.g.  explosion  cata¬ 
logs,  mine  blast  records,  satellite  imagery,  InSAR  signals),  the  most  abundant  source  of  ground- 
truth  information  is  seismic  arrival-time  data.  Although  event-location  accuracy  achievable  from 
arrival-time  data  is  limited  by  our  knowledge  of  the  Earth’ s  velocity  structure,  the  location  accu¬ 
racy  needed  for  nuclear  monitoring  applications  (<5  km,  or  GT5)  can  be  sometimes  achieved  by 
applying  multiple-event  location  methods  to  regional  and  teleseismic  data  from  a  small  event  clus¬ 
ter,  to  constrain  the  relative  locations  between  the  events,  and  using  data  from  a  well-distributed 
network  of  local  stations  to  constrain  the  absolute  locations  of  one  or  more  of  the  events.  In  partic¬ 
ular,  the  hypocentroidal  decomposition  (HDC)  method  (Jordan  and  Sverdrup,  1981;  Engdahl  and 
Bergman,  2001),  has  been  used  extensively  to  develop  a  large  GT5  data  base  for  use  by  the  nuclear 
monitoring  community  (Bonda'r  et  al.,  2004;  Bonda'r  and  McLaughlin,  2009). 

However,  good  local  seismic  networks  are  not  available  in  all  areas  of  monitoring  interest, 
which  motivated  Bonda'r  et  al.  (2008)  to  extend  the  HDC  method  with  reciprocal  cluster 
analysis  (RCA)  in  order  to  relax  the  requirements  for  local  data.  The  resulting  HDC-RCA 
method  is  a  two-step  procedure.  First,  HDC  is  applied  to  regional  and  teleseismic  arrivals  to 
determine  the  relative  locations  of  the  events  in  a  cluster  (known  an  cluster  vectors).  Second, 
RCA  is  applied  to  determine  the  absolute  location  of  the  cluster  (cluster  hypocentroid)  from 
arrival  times  observed  at  one  or  more  local  stations.  Exploiting  travel-time  reciprocity,  RCA 
treats  the  event  cluster  as  a  local  seismic  network  which  is  used  to  relocate  the  local  stations;  the 
cluster  hypocentroid  is  then  derived  from  the  resulting  station  mislocations.  RCA  does  not 
require  that  the  local  stations  be  able  to  determine  accurate  locations  of  individual  events  and 
thus  extends  the  applicability  of  the  HDC  method. 

This  project  is  attempting  to  extend  multiple-event  location  methodology  even  further  with  the 
goal  of  improving  the  quality  of  ground-truth  event  databases.  The  improvements  sought  include 
more  complete  and  rigorous  quantification  of  location  uncertainty,  especially  for  event  focal 
depths  and  origin  times;  better  quality  control  on  phase  picks  associated  with  ground-truth  events; 
and  a  greater  quantity  of  events  that  meet  specified  ground-truth  criteria,  particularly  in  areas 
poorly  represented  in  current  databases.  To  achieve  these  improvements  we  are  pursuing  several 
en-hancements  to  multiple-event  location  methodology  applicable  to  earthquake  clusters.  A 
primary  enhancement  is  the  combined  use  of  local,  regional  and  teleseismic  arrival-time  data  in  a 
unified  analysis  in  order  to  extract  all  information  on  both  absolute  and  relative  event  locations  that 
is  available  in  the  data.  Further,  we  are  applying  a  rigorous  framework  for  uncertainty  analysis  to 
characterize  the  errors  in  ground-truth  location  parameters  more  completely  and  assign  appropri¬ 
ate  ground-truth  levels,  ur  uncertainty  approach  will  incorporate  more  realistic,  and  generally 
less  restrictive,  assumptions  about  unknown  travel-time  corrections  attributable  to  unknown  ve¬ 
locity  variations  in  the  Earth.  The  improved  treatment  of  these  corrections  will  potentially  reduce 
location  uncertainty  and  allow  the  application  of  multiple-event  location  analysis  to  earth  uake 
clusters  excluded  by  current  methods,  resulting  in  more  ground-truth  events. 
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3  TECHNICAL  APPROACH 

3.1  Problem  Formulation 


We  are  addressing  the  problem  of  inferring  the  location  parameters  of  a  set  of  J  seismic  events  from 
arrival-time  data  observed  at  a  network  of  seismic  stations.  We  let  h;  denote  the  4-component 
vector  of  hypocentral  parameters  (including  origin  time)  of  the  jth  event,  and  let  t;  denote  the 
vector  of  arrival  times  observed  from  that  event,  noting  that  the  station/phase-type  combinations 
for  which  times  are  observed  can  differ  among  events.  The  multiple-event  location  problem  is  to 
solve  for  the  liy  from 


f,  /’j*  l(r  m)  <’/•  j  =  (1) 

where  Fj  is  a  forward  modeling  function  which  adds  the  origin  time  of  the  jth  event  to  travel  times 
predicted  from  its  hypocenter,  and  where  e_,  is  a  vector  of  measurement  errors  in  the  arrival-time 
picks.  The  travel  times  predicted  by  Fj  are  assumed  to  be  the  theoretical  times  through  the  Earth’s 
seismic  velocity  structure,  as  described  by  a  model  parameter  vector  m. 

In  practice,  the  Earth’s  velocity  is  not  known  and  travel-time  calculation  is  done  with  a 
refer-ence  model  mref.  We  thus  replace  eq.  (1)  with 

tj  =  Fjihj]  mref)  +  cj  +  ej,  j  =  l,...,J  (2) 

where  each  vector  c 3-  contains  travel-time  corrections,  given  by  (3) 

ci  =  Fj  (hi ;  m)  -  Fj(hj ;  mref )  • 

The  correction  vectors  clearly  depend  on  the  models  m  and  mref  as  well  as  their  respective  event 
hypocenters,  although  our  notation  does  not  show  these  dependences.  It  is  important  to  note  that,  if 
the  Cj  are  assumed  to  be  zero,  effectively  ignoring  the  inadequacy  of  the  reference  velocity  model, 
the  multiple-event  location  problem  decouples  across  events  and  can  be  solved  by  locating  each 
event  separately  from  its  own  data,  i.e.  performing  single-event  location  on  each  event. 

The  original  multiple-event  location  methods  (e.g.  Jordan  and  Sverdrup,  1981;  Pavlis  and 
Booker,  1983)  assumed  that  the  travel-time  corrections  do  not  depend  on  event  location,  which 
effectively  assumes  that  the  events  are  in  a  tight  cluster  with  location  differences  between  events 
being  much  less  than  event-station  distances.  This  assumption  can  be  stated  as 

Cj  =  S j  7  (4) 

where  the  vector  7  contains  a  set  of  “master”  corrections  (one  for  each  station/phase-type  combi¬ 
nation  observed  for  at  least  one  event)  and  the  matrix  S j ,  containing  ones  and  zeros  as  its  elements, 
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selects  the  appropriate  subset  of  master  corrections  for  the  jth  event.  Substituting  (4)  into  (2)  yields 

tj  =  Fj  (hj-;  mref)  +  Sj7  +  e,,  j  =  1, . . . ,  J.  (5) 

Further,  the  original  multiple-event  location  methods  effectively  treated  the  master  corrections  as 
deterministic  unknowns,  unconstrained  by  prior  information.  Different  methods  solved  eqs.  (5) 
with  different  numerical  techniques,  some  solving  explicitly  for  7  (in  conjunction  with  the  hj)  and 
others  eliminating  7  as  a  “nuisance”  parameter  and  solving  explicitly  only  for  the  hj . 


3.1.1  The  HDC-RCA  Method 


The  baseline  for  this  project  is  the  hypocentroidal  decomposition  (HDC)  method  as  implemented 
by  Engdahl  and  Bergman  (2001)  and  extended  with  reciprocal  cluster  analysis  (RCA)  by  Bondar 
et  al.  (2008).  The  HDC  method  decomposes  the  event  hypocenter  vectors  into  a  hypocentroid  h 
and  cluster  vectors  Ahj,  defined  by 


h  = 


Ah,= 


1 

J 


Y h/ 

3= 1 
-h. 


(6) 

(7) 


The  cluster  vectors  determine  the  pattern  of  event  locations  within  the  event  cluster  while  the 
hypocentroid  determines  the  absolute  location  of  the  cluster. 

Jordan  and  Sverdrup  (1981)  showed  that,  in  allowing  unconstrained  travel-time  corrections, 
some  information  about  the  event  locations  is  lost,  primarily  information  about  the  hypocentroid. 
This  can  be  understood  by  linearizing  the  forward  functions  Fj  about  h  and  approximating  eq.  (5) 
as 


tj  a;  Fj(h;  mref)  +  Bj  Ahj  +  Sj7  + ej,  j  =  1, . . . ,  J  (8) 

where  Bj  is  the  Jacobian  (partial  derivative  matrix)  of  Fj{ h,  m)  with  respect  to  h,  evaluated  at 
h  =  h  (and  m  =  mref).  Now  let  us  define  a  master  forward  function  F  which,  for  any  given 
hypocenter,  predicts  arrival  times  for  the  distinct  station/phase  combinations  existing  in  the  multi¬ 
event  data  set.  Given  the  selection  matrices  Sj,  we  can  write 

Fj(h;  mref)  =  SiJr(h;  mref)  (9) 

and  thus,  letting  B  be  the  Jacobian  of  F  with  respect  to  its  hypocenter  argument, 


Bj  =  SjB. 


(10) 


We  can  now  write  eq.  (8)  as 


Jr(h;mref)  +  B  Ahj +7]  +  ejr  j  —  1, 


J. 


(11) 


When  the  cluster  vectors  Ahj  are  fixed,  this  defines  an  inverse  problem  for  the  hypocentroid  h  and 
travel-time  correction  vector  7.  We  see  that  the  problem  displays  a  complete  ambiguity  between 
^(h;  mref)  and  7  and,  thus,  between  h  itself  and  7.  That  is,  given  a  potential  solution  (hi,  7-J, 
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any  other  hypocentroid  h2  will  be  an  equally  good  solution  by  setting  its  associated  correction 
vector  to 

72  =  7i  +  ^(hi;  mref)  -  ^(h2;  mref).  (12) 

The  complete  ambiguity  between  h  and  7  occurs  only  in  the  limit  of  zero  cluster  dimension 
(Ahj  — >  0)  since  linearization  of  the  Fj  is  exact  only  in  this  limit.  For  clusters  of  finite  dimension, 
the  data  will  contain  some  information  to  constrain  the  hypocentroid,  as  a  result  of  nonlinearity, 
but  it  will  generally  be  weak.  With  7  unconstrained,  only  the  cluster  vectors  are  well  determined 
in  the  multiple-event  location  problem. 

Therefore,  absolute  event  locations  will  be  well  determined  only  if  prior  constraints  on  the 
travel-time  corrections  or  on  one  or  more  of  the  event  hypocenters  is  available.  Given  that  non- 
seismic  ground-truth  event  information  is  rarely  available,  a  more  common  source  of  prior  infor¬ 
mation  on  event  locations  is  a  local  seismic  network,  which  can  provide  an  accurate  location  for 
one  or  more  of  the  events  in  a  cluster  if  the  network  is  sufficiently  well  distributed.  However,  good 
local  networks  are  not  available  in  many  regions,  which  motivated  Bondar  et  al.  (2008)  to  develop 
reciprocal  cluster  analysis.  The  RCA  method  can  use  a  small  number  of  local  stations  (as  few  as 
one)  to  determine  the  hypocentroid  of  a  cluster.  Exploiting  the  source-receiver  reciprocity  of  travel 
times,  the  method  treats  a  cluster  of  events  as  a  seismic  network  to  “relocate”  local  stations;  the  re¬ 
sulting  station  mislocations  are  then  used  to  correct  the  cluster  hypocentroid.  Thus,  the  HDC-RCA 
hybrid  method  is  a  two-step  process:  (1)  application  of  the  traditional  HDC  method  to  regional  and 
teleseismic  arrival-time  data  to  determine  cluster  vectors  and  a  preliminary  estimate  of  the  cluster 
hypocentroid.  followed  by  (2)  application  of  RCA  to  data  from  local  stations  to  refine  (shift)  the 
hypocentroid,  with  the  cluster  vectors  unchanged. 

3.1.2  Extensions 

This  project  is  extending  multiple-event  location  methodology  in  three  key  respects.  The  first  is 
to  use  all  arrival-time  data  —  including  local,  regional  and  teleseismic  —  in  a  unified  analysis.  In 
principle,  the  HDC-RCA  method  will  then  be  invoked  when  the  regional/teleseismic  travel-time 
corrections  are  unconstrained  and  local  corrections  are  set  to  zero.  Solving  the  multiple-event 
location  inverse  problem  in  a  single  step  is  expected  to  have  algorithmic  advantages  as  well  as 
facilitate  uncertainty  analysis. 

The  second  extension  to  multiple-event  location  methodology  we  are  developing  in  this  project 
pertains  to  the  numerical  algorithm  used  to  solve  the  problem.  We  are  developing  a  general  algo¬ 
rithm  based  on  grid-search  and  nonlinear  conjugate  gradients  techniques,  avoiding  matrix  inversion 
or  diagonalization.  In  addition  to  scaling  better  to  large  data  sets,  these  numerical  techniques  al¬ 
low  the  use  of  non-Gaussian  distributions  for  observational  (pick)  errors,  which  previous  work 
has  shown  to  be  an  effective  device  for  mitigating  the  effects  of  outliers  in  the  observed  data  (e.g. 
Billings  et  al,  1994;  Kennett,  2006;  Rodi,  2006). 

The  third  extension  is  to  allow  prior  information  on  travel-time  corrections  in  the  form  of  prior 
variances  on  the  corrections  and  prior  covariances  between  them.  The  variances  and  covariances 
can  be  generated  from  assumed  geostatistical  parameters  of  velocity  heterogeneity  in  the  Earth 
using  the  travel-time  covariance  modeling  technique  developed  by  Rodi  and  Myers  (2013).  The 
traditional  practice  of  allowing  the  corrections  to  be  unconstrained  effectively  assigns  them  infinite 
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prior  variance.  The  motivation  for  this  extension  is  the  hope  that  realistic,  finite  variances  will 
preserve  useful  information  about  absolute  event  locations. 

In  conjunction  with  this  third  extension,  we  are  attempting  to  introduce  event-location  depen¬ 
dence  into  travel-time  corrections.  As  defined  above,  the  travel-time  corrections  for  any  given 
event  can  be  expressed  as  eq.  (4)  with 

7  =  Jr(href;  m)  -  ^(href;  mref)  (13) 

where  T  is  the  master  forward  function  introduced  above  (calculating  travel  times  for  the  unique 
station/phase-types  in  the  data  set)  and  href  is,  by  assumption,  any  hypocenter  in  or  near  the  cluster 
(e.g.  the  hypocentroid).  We  can  introduce  a  limited  dependence  of  c3  on  li?  as  follows.  Let  lijef, 
k  —  1, . . . ,  K,  be  a  set  of  reference  hypocenter s  and  define  a  set  of  master  correction  vectors  as 

=  k=l„,.,K.  (14) 

For  example,  the  lijef  could  be  the  nodes  of  a  regular  hypocenter  grid  which,  if  it  contained  only 
one  point,  would  revert  to  the  event-independence  assumption.  Now  let  7  be  the  concatenation  of 
the  7 k  into  a  single  vector: 


Then,  c?  can  still  be  defined  by  eq.  (4)  by  redefining  the  matrices  Sy ,  each  now  containing  K 
times  as  many  columns,  to  perform  spatial  interpolation  with  respect  to  the  in  addition  to 
station/phase  selection.  Owing  to  interpolation,  S3  will  depend  on  li? . 

We  are  implementing  these  extensions  as  an  extended  version  of  MIT’s  event-location  software 
GMEL  (Rodi,  2006).  At  this  time,  our  implementation  is  restricted  to  the  special  case  of  a  single 
reference  event  and  master  correction  vector  (K  =  1),  corresponding  to  traditional  multiple-event 
location  methods  like  HDC. 

3.2  Bayesian  Framework 

Our  new  multiple-event  location  method  is  based  on  a  Bayesian  framework  in  which  prior  proba¬ 
bility  distributions  on  the  pick  errors  (e,)  and  problem  unknowns  (h;  and  7)  are  specified,  and  a 
solution  of  the  problem  is  embodied  in  a  posterior  joint  probability  distribution  on  the  unknowns 
or  specific  properties  thereof.  We  have  implemented  the  Bayesian  approach  under  the  following 
assumptions  about  prior  probability  distributions: 

•  The  pick  errors  (elements  of  the  e3)  are  mutually  independent,  zero-mean  random  variables, 
each  having  a  specified  probability  distribution,  which  may  be  non-Gaussian. 

•  The  master  travel-time  correction  vector  7  is  jointly  Gaussian  with  zero  mean  and  a  specified 
variance  (variance-covariance)  matrix  Tpn. 

•  The  event  hypocenter  vectors  h;  are  mutually  independent  and  Gaussian,  with  each  having 
a  specified  mean  hpn  and  4x4  variance  matrix  Hpn. 
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Our  multiple-event  location  algorithm  finds  the  MAP  (maximum  a  posteriori)  estimates  of  the 
event  locations  and  travel-time  corrections,  defined  to  maximize  their  joint  posterior  probability 
distribution.  Equivalently,  the  MAP  estimates  minimize  the  negative  logarithm  of  the  posterior 
distribution,  which  defines  a  cost  function  we  denote  as  Under  the  assumptions  above,  the  cost 
function  (omitting  a  constant  term)  is  given  by 


3= 1  3= 1 


(16) 


where  fiy  denotes  a  data  misfit  function  for  the  jth  event  implied  by  the  assumed  prior  probability 
distribution  for  pick  errors.  We  assume  pick  errors  have  a  generalized  Gaussian  distribution  (e.g. 
Billings  et  al.,  1994)  of  some  order  p  >1,  in  which  case  the  data  misfit  functions  are  given  by 


1 


Ni 


Fijihpm 


ref  \ 


T 


7<- 


i=  1 


(17) 


where  Ay  is  the  number  of  data  in  ty;  /:t?  is  the  zth  element  of  ty;  Ft]  is  the  forward  modeling  func¬ 
tion  for  tij,  as  extracted  from  Ay  the  vector  sy-  (which  in  general  depends  on  h; )  is  the  zth  column 
of  Sj,  the  transposed  selection/interpolation  matrix;  and  cy?  is  an  assigned  standard  pick  error. 
The  generalized  Gaussian  distribution  becomes  Gaussian,  and  <fy  becomes  a  quadratic  function  of 
the  data  residuals,  when  p  =  2. 


3.3  Travel-Time  Covariances 


Rodi  and  Myers  (2013)  developed  a  method  for  calculating  the  variance  matrix  of  travel-time 
prediction  errors  as  determined  by  geostatistical  parameters  characterizing  errors  in  the  velocity 
model  used  for  travel-time  prediction.  This  project  is  applying  this  method  to  generate  the  prior 
variance  matrix  rpn. 

The  Rodi-Myers  method  assumes  a  linear  relationship  between  model  velocity  (or  slowness) 
error  and  travel-time  errors.  We  thus  linearize  eq.  (14)  to  obtain 

Ik  ~  A fc(m  -  mref),  k  =  1, . . . ,  K  (18) 


where  Ak  is  the  Jacobian  of  Jr(  hjef:  m)  with  respect  to  m  evaluated  at  m  =  mref.  The  elements 
of  A  are  sensitivities  to  the  velocity  model  parameters  calculated  from  the  geometrical  ray,  for  the 
appropriate  station/phase,  connecting  hjef  to  the  station  location.  Concatenating  the  Jacobians  as 


A 


(Al\ 

\AkJ 


(19) 


and  recalling  (15),  we  can  write  (18)  more  concisely  as 


7  a;  A(m  —  mref). 
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(20) 


The  method  of  Rodi  and  Myers  (2013)  assumes  that  m,  the  velocity  model  corresponding  to 
the  real  Earth,  is  a  Gaussian  random  variable  with  known  mean  and  variance;  here  we  will  take 

E[m]  =  mref  (21) 

Var[m]  =  C  (22) 

where  C  is  a  given  variance  matrix.  Eqs.  (20)-(22)  imply  that  7  is  Gaussian  with  zero  mean  and 
variance  matrix  given  by 

rpri  =  ACAr  (23) 

The  Rodi-Myers  method  specifies  the  model  variance  indirectly  in  terms  of  its  inverse,  C  1 ,  which 
is  parameterized  with  geostatistical  parameters  comprising  a  velocity  variance  and  spatial  correla¬ 
tion  lengths  in  the  horizontal  and  vertical  directions.  The  method  calculates  the  fth  column  of  rpn 
as  Au,  with  u  obtained  as  the  solution  to  the  linear  system 

C_1u  =  (24) 

where  is  the  (th  column  of  AT,  i.e.  the  model  sensitivities  corresponding  to  the  fth  ray.  For  tele- 
seismic  rays,  a/  can  have  many  non-zero  entries  (compared  to  local  and  regional  rays),  requiring 
much  computational  effort  to  solve  the  linear  system  (24).  This  project  devoted  significant  effort  to 
speeding  up  this  calculation  by  avoiding  calculation  of  elements  of  u  which  are  predictably  small, 
i.e.  those  corresponding  to  velocity  model  nodes  far  from  the  ray. 


3.4  Minimization  Algorithm 


We  have  enhanced  GMEL  to  perform  the  minimization  of  T  in  eq.  (16)  with  a  hybrid  method  that 
combines  nonlinear  conjugate  gradients  (NLCG)  to  perform  minimization  with  respect  to  7  and 
grid  search  to  perform  minimization  with  respect  to  the  hr  To  be  specific,  define  a  reduced  cost 
function,  as 


'k(7;ti,...,tJ) 


min  ^(hi,...,hj,7;ti,..^tj). 


(25) 


Since  T  is  minimum  with  respect  to  the  hy  for  any  fixed  7,  minimization  of  T  with  respect  to  7 
achieves  joint  minimization  of  T  with  respect  to  the  li;  and  7.  Further,  we  can  see  from  eq.  (16) 
that,  with  7  fixed,  the  task  of  minimizing  T  with  respect  to  hypocenters  decouples  across  events. 
GMEL  performs  this  task  by  applying  grid  search  to  each  event  hypocenter  in  turn. 

NLCG  is  applied  to  the  minimization  of  T  with  respect  to  7.  This  method  is  well  documented 
in  the  optimization  literature  (e.g.  Kelley,  1999)  and  we  focus  here  on  our  implementation  of  the 
method.  NLCG  is  an  iterative  method  in  which  the  (th  solution  iterate  is  updated  according  to 
(initializing  with  70  =  0) 

7m  =  y  +  c/  p£  (26) 

where  the  vector  p'  denotes  a  search  direction  and  the  scalar  (x  a  step-length  in  that  direction.  The 
search  direction  is  generated  recursively  as 

P*  =  pg*  +  /3V_1 
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(27) 


where  g'  is  the  gradient  of  T  with  respect  to  7,  evaluated  at  •~f<  ;  P  is  a  symmetric,  positive  definite 
preconditioning  matrix;  and  is  a  memory  factor.  Eq.  (16)  and  the  stationarity  of  'E  with  respect 
to  the  hypocenters  imply  that 

g£  =  V'Jf(7f;t1,...,tJ) 

j 

=  +  (rprl)-y.  (28) 

3= 1 


Here  V  denotes  gradient  with  respect  to  7,  and  the  IT-  denote  the  hypocenters  that  minimize  for 
the  current  7,  i.e. 


=  ^(hf , . . . ,  hj,  7^;  ti 


G  • 


(29) 


A  crucial  feature  of  our  implementation  of  NLCG  in  GMEL  is  the  choice  of  the  preconditioner, 
which  we  set  to  the  prior  variance  on  7: 

P  =  Tpri.  (30) 


By  doing  so,  the  calculations  can  be  coded  to  avoid  the  use  of  the  inverse  of  rpn;  only  operations 
with  rpn  itself  are  needed.  Looking  at  (28),  we  see  this  to  be  the  case  for  the  calculation  of  the 
preconditioned  gradient,  rpn  gf,  which  is  used  in  eq.  (27)  as  well  as  in  the  formula  for  f3e  (not 
shown  here).  Additionally,  our  implementation  carries  the  quantities 


rjt  =  (rpn)-iy 

(31) 

q£  =  (rpri)-y 

(32) 

using  the  recursions 

P+ 1  P  .  P  P 

ry^  =  rj  +  a  q 

(33) 

qe  =  gE  +  &  q^1 

(34) 

which  also  avoid  operation  with  (rpn) 

-1.  The  last  term  of  the  cost  function  in  eq.  (16)  can  then  be 

calculated  using 

(V)T(rpri)"V  =  (y)V. 

(35) 

Estimation  of  Pick-Error  Variances 

The  variance  of  each  pick  error  (el3)  is  proportional  to  the  square  of  its  standard  error: 

Var [e^]  oc  ofj  (36) 

where  the  proportionality  factor  depends  on  the  order  of  the  generalized  Gaussian  distribution 
and  is  one  for  the  Gaussian  case  (p  =  2).  GMEL  allows  the  a%3  to  be  unknowns  in  the  multiple- 
event  location  problem  in  addition  to  7  and  the  hj.  As  this  capability  is  used  in  this  project,  each 
station/phase-type  pair  is  assigned  a  standard  pick  error  uk,  common  to  all  events,  such  that 


aij  uK(i,j) 
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(37) 


where  the  function  K  reports  the  station/phase-type  combination  for  an  individual  datum.  Prior  to 
updating  7  at  each  step  of  the  NLCG  iteration,  the  standard  errors  are  updated  with 

{vkjp  =  ^-  N  -  Fo(hi; mref)  - sS^T  (38) 

k  ij-K(i,j)=k 

where  Nk  denotes  the  number  of  data  for  the  A  th  station/phase  combination.  This  updating  formula 
is  derived  as  the  maximum-likelihood  estimate  of  vk ,  but  its  value  is  clipped  to  obey  prescribed 
upper  and  lower  bounds.  Since  the  arrival-time  residuals  depend  on  the  event  hypocenters,  the  grid 
searches  for  the  h;  and  the  resetting  of  the  ip-  are  repeated  until  a  convergence  is  reached,  creating 
an  inner  loop  within  the  NLCG  loop. 

3.5  Location  Uncertainty  Algorithm 

In  the  first  year  of  the  project  we  formulated  an  approach  to  the  calculation  of  hypocenter  uncer¬ 
tainty  in  the  multiple-event  location  problem,  although  we  have  not  yet  implemented  it  in  GMEL. 
The  approach  is  similar  to  that  of  Rodi  (2008)  but  promises  to  be  more  practical. 

The  primary  quantity  of  interest  is  the  uncertainty  in  the  4-component  hypocenter  vector  h; 
for  any  given  j.  In  the  framework  of  Bayesian  inference,  this  is  described  by  the  marginal  poste¬ 
rior  probability  distribution  of  h; ,  marginalized  with  respect  to  the  remaining  hypocenter  vectors, 
h ji,f  /  j,  and  the  travel-time  correction  vector  7.  This  marginal  distribution  accounts  for  the  un¬ 
certainty  in  li7  induced  by  trade-offs  with  all  these  other  parameters,  in  addition  to  that  induced  by 
pick  errors  in  its  own  observations  tr  Rodi  (2008)  considered  multiple-event  location  uncertainty 
in  a  fully  nonlinear,  non-Gaussian  (p  ^  2)  framework,  leading  to  a  computationally  intensive  algo¬ 
rithm  for  computing  location  confidence  regions.  A  much  more  efficient  algorithm  resulted  with 
the  use  of  Gaussian  approximations  to  posterior  distributions.  The  uncertainty  approach  we  have 
formulated  for  this  project  makes  use  of  such  approximations. 

The  joint  posterior  distribution  of  all  the  unknown  parameters  in  the  multiple-event  location 
problem  is  given  by 

p(h1,...,hj,7|t1,...,tj)  =  Kq  exp  {  -  tf(hi, . . . ,  hj,  7;  t1? . . . ,  tj)}  (39) 

where  K0  is  a  normalizing  constant.  The  marginal  posterior  on  a  single  event  location  vector 
(say  hi),  denoted  p(hi  |  tq, . . . ,  tj),  is  obtained  by  integrating  this  joint  posterior  over  h2, 
h./  and  7.  Given  the  nonlinearity  of  the  forward  problem  and  the  possible  assumption  of  non- 
Gaussian  pick  errors,  the  necessary  integration  cannot  be  done  analytically  or,  given  the  large 
number  of  marginal  parameters,  is  not  amenable  to  numerical  quadrature.  An  alternative  is  to  use 
sampling  techniques  (e.g.  Markov  chain  Monte  Carlo,  or  MCMC)  to  generate  an  ensemble  from 
p(hi  |  ti, . . . ,  tj),  as  is  done  by  the  BayesLoc  multiple-event  location  algorithm  (Myers  et  al., 
2007,  2009).  However,  sampling  techniques  can  be  computationally  intensive  for  large  problems. 
A  more  direct  approach  commonly  taken  in  nonlinear  inverse  problems,  suitable  for  the  case  of 
Gaussian  pick  errors,  is  to  perform  uncertainty  analysis  under  a  linear  approximation  to  the  forward 
modeling  functions,  where  the  linearization  is  done  around  the  MAP  solution.  That  is,  the  forward 
functions  are  approximated  as 

FAK  mref)  «  FAMAP;  mref)  +  B,  (h,  -  h^AP) 
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(40) 


where  B j  is  the  Jacobian  of  Ff  with  respect  to  h;  evaluated  at  the  MAP  estimate.  Under  this 
approximation,  the  joint  posterior  distribution  of  the  hy  and  7  is  Gaussian,  as  is  the  marginal  pos¬ 
terior  of  hi.  The  uncertainty  in  hi  is  thus  fully  characterized  by  the  variance  of  p(hi  1 1 1 , . . . ,  tj) 
(the  posterior  variance  matrix),  which  is  a  partition  of  the  joint  posterior  variance  matrix  of  all  the 
unknown  parameters.  Unfortunately,  while  the  inverse  of  the  joint  posterior  variance  matrix  has  a 
simple,  analytic  expression,  the  joint  variance  matrix  itself,  or  a  partition  of  it,  must  be  evaluated 
numerically  with  calculations  performed  over  the  full  set  of  unknowns. 

To  avoid  the  inversion  of  large  matrices,  as  well  as  relax  the  requirement  of  Gaussian  pick 
errors,  we  have  formulated  a  different  approach  to  calculating  an  approximation  to  the  posterior 
variances  matrix  of  a  hypocenter.  It  makes  two  related  approximations.  First,  we  approximate  the 
marginal  posterior  distribution  of  hi  by  replacing  integration  with  minimization,  i.e. 

p(hi  | ti, , .  .,tj)  «  Kx  exp  {  -  \fh(hi;ti, . . .  ,tj)}  (41) 

where  K\  is  another  normalizing  constant  and  T 1  is  the  reduced  cost  function  obtained  by  mini¬ 
mizing  T  with  respect  to  all  parameters  except  hi : 


T'i(h1;t1,...,tJ)  =  min  T(hi, . . . ,  hj,  7;  ti, . . . ,  tj). 

h2,...,hj,7 


(42) 


We  note  that  this  approximation  is  exact  just  in  case  the  joint  posterior  is  Gaussian.  Rodi  (2008) 
attempted  to  sample  this  approximate  posterior  probability  distribution  on  a  grid  in  hi  space,  which 
turned  out  to  be  very  computationally  intensive  since  the  minimization  in  eq.  (42)  is  performed  for 
each  point  of  the  sampling  grid.  To  avoid  this,  we  invoke  a  second  approximation,  that  the  marginal 
posterior  distribution  of  hi  is  Gaussian  or,  equivalently,  the  reduced  cost  function  47  is  quadratic, 
taking  the  form 

*  i(hi;  u, . . . ,  t  j) « -  hrp)r(Hr)_1(hi  -  hrp).  (43) 

This  takes  the  posterior  mean  of  hi  to  be  its  MAP  estimate  found  by  minimizing  T  with  respect  to 
all  the  unknowns.  The  problem  remains  to  calculate  the  posterior  variance  matrix  implied  by  this 
approximation,  IT1,05. 

We  have  formulated  a  perturbation  technique  to  estimate  Hjos,  which  proceeds  as  follows.  Let 
T'  denote  the  cost  function  defined  by  augmenting  T  as 


^'(hi, . . . ,  hj;  ti, . . . ,  tj)  =  d/(hi, . . . ,  hj;  t1; . . . ,  tj) 

+  i(hi  -  hrp  -  a)THo1(h1  -  hrAP  -  a).  (44) 

The  added  term  plays  the  role  of  additional,  independent  prior  information  on  hi  in  the  form 
of  a  Gaussian  distribution  with  mean  (h^AP  +  a)  and  variance  H0.  Owing  to  this  addition,  the 
hypocenters  and  correction  vector  minimizing  T'  will  be  perturbed  from  the  MAP  values  found  to 
minimize  T. 

It  is  easy  to  show  that  the  reduced  cost  function  Tj  corresponding  to  T'  is  augmented  similarly 
to  T',  that  is 


Tj(hi;ti,...,tj) 


Ti(hi;ti, . . .  ,tj) 

+  i(hi  -  hrP  -  a)rHo  ^h,  -  hrAP  -  a). 


(45) 
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The  perturbed  MAP  estimate  of  hi  minimizes  T ,  and,  therefore,  makes  the  gradient  of  T'  equal 
to  zero.  Using  the  quadratic  approximation  (43),  the  gradient  is  given  by 

VT'  =  (Hr)'1*  +  Hq  x(x  -  a)  (46) 

where  x  denotes  the  difference  between  hi  and  its  unperturbed  MAP  estimate: 

x  =  hi  -  h^AP.  (47) 

Eq.  (46)  implies  that  the  hypocenter  perturbation  minimizing  the  augmented  cost  function  satisfies 

x  =  HfosHo1(a-x).  (48) 

Suppose  now  we  solve  the  perturbed  multiple-event  location  problem  for  several  values  of  a, 
denoted  a.£  for  £  —  1, . . . ,  L,  yielding  MAP  perturbations  xt  which  satisfy 

x<  =  HrHfl-1(a<-x<)  (49) 

and  thus 

x*(a*  -  xf)T  =  Hpos  Hq  x(a^  -  x(){at  -  xe)T .  (50) 

Summing  this  equation  over  £  we  obtain 

N  =  HposHg1M  (51) 


where 

L 

M  =  T>  -  x^)(a t  -  xe)T 
£=1 
L 

N  =  -X£)r. 

t=  1 

If  L  is  sufficiently  large  (at  least  4)  and  the  vectors  a/  sufficiently  diverse,  the  symmetric  matrix 
M  will  be  positive  definite  and  we  can  write  the  desired  posterior  variance  matrix  of  h,  as 

HP  =  NM-'Hfl.  (54) 

Our  location  uncertainty  algorithm  thus  entails  solving  four  or  more  perturbed  multiple-event  loca¬ 
tion  problems  to  obtain  the  posterior  variance  matrix  for  each  hypocenter,  using  the  formula  (54). 
To  force  symmetry  on  Hpos  we  can  average  the  matrix  resulting  from  (54)  with  its  transpose.  Ap¬ 
propriate  choices  for  the  a?  and  H0  will  be  addressed  as  the  method  is  implemented  in  GMEL.  If 
the  a(  are  not  too  large,  the  computational  effort  to  solve  the  perturbed  problems  will  presumably 
require  much  less  than  for  the  original  (unperturbed)  multiple-event  location  problem. 

It  is  important  to  note  that  while  this  approach  to  location  uncertainty  relies  on  Gaussian  ap¬ 
proximations  to  posterior  distributions,  it  does  not  explicitly  assume  that  prior  distributions  are 
Gaussian.  Thus,  a  non-Gaussian  distribution  for  pick  errors  will  still  be  allowed.  The  approximate 
posterior  we  calculate  will  be  valid  to  the  extent  that  the  pooling  of  non-Gaussian  information  re¬ 
sults  in  approximate  Gaussian  posteriors  on  the  event  hypocenters  as  a  consequence  of  the  central 
limit  theorem. 


(52) 

(53) 
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4  RESULTS  AND  DISCUSSION 


To  date  we  have  applied  our  new  multiple-event  location  method  to  arrival-time  data  for  two  event 
clusters  included  in  the  IASPEI  Reference  Event  List  (REL)  to  demonstrate  and  test  our  approach. 
The  first  is  the  Rogun  earthquake  cluster  in  Tajikistan.  The  IASPEI  REL  results  for  this  cluster 
were  obtained  with  the  HDC-RCA  method  and  discussed  by  BondaT  el  al.  (2008).  The  second 
cluster  comprises  nuclear  explosions  in  the  Pahute  Mesa  area  of  the  Nevada  Test  Site.  The  IASPEI 
REL  reports  the  GTO  location  information  for  these  events,  allowing  us  to  assess  absolute  errors  in 
our  multiple-event  location  solutions.  Table  1  lists  event  and  data  counts  for  the  two  clusters. 


The  results  shown  here  for  both  clusters  were  obtained  using  only  first-arrival  P-wave  times 
and,  since  these  data  poorly  constrain  event  focal  depth,  with  event  depths  fixed  to  their  catalog 
values.  In  all  but  one  example,  the  AK135  1-D  velocity  model  (Kennett  et  al.,  1995)  was  used  as 
the  reference  model  for  travel-time  calculation. 


4.1  Rogun  Cluster  Results 

Fig.  1  compares  two  GMEL  multiple-event  location  solutions  for  the  Rogun  cluster  to  the  HDC- 
RCA  solution  reported  in  the  IASPEI  REL.  Both  GMEL  solutions  used  a  diagonal  covariance 
matrix  for  travel-time  corrections;  these  examples  did  not  use  a  full  travel-time  variance- 
covariance  matrix  generated  with  our  covariance  modeling  technique.  Travel-time  corrections  for 
teleseismic  and  regional  paths  (P  and  Pn  phases,  respectively)  were  assigned  a  standard  deviation 
of  20  s,  effectively  treating  these  corrections  as  unconstrained.  The  standard  deviation  for  local 
corrections  (Pg  and  Pb  phases),  however,  were  set  to  zero,  which  fixes  the  corrections  themselves 
to  zero. 

The  panel  on  the  left  of  Fig.  1  shows  the  GMEL  solution  obtained  with  only  teleseismic  and  re¬ 
gional  arrivals.  As  expected,  the  difference  between  the  GMEL  and  HDC-RCA  epicenters  consists 


Table  1:  Data  sets  for  two  event  clusters  in  the  IASPEI  REL. 


Rogun 

Pahute  Mesa 

Number  of  events 

17 

57 

Number  of  stations 

371 

1  205 

Number  of  arrivals 

1  356 

11  078 

P 

1  053 

8255 

Pn 

270 

2  771 

Pg/Pb 

33 

52 
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mainly  of  a  shift  of  about  13  km  in  the  cluster  centroid  to  the  south-southeast.  This  demonstrates 
the  loss  of  absolute  event  location  information  that  is  incurred  when  unconstrained  (or  weakly  con¬ 
strained)  travel-time  corrections  are  allowed  in  the  multiple-event  location  problem.  The  relative 
patterns  of  the  GMEL  and  HDC-RCA  epicenters,  however,  are  similar. 

The  right-hand  panel  of  Fig.  1  shows  the  GMEL  solution  obtained  with  arrival-time  data  at 
local  stations  included  with  the  teleseismic  and  regional  arrivals.  The  GMEL  and  HDC-RCA 
epicenters  now  agree  quite  well,  with  the  median  epicentral  distance  between  the  two  solutions 
being  3.5  km.  This  agreement  supports  the  premise  that  including  teleseismic,  regional  and  local 
data  into  a  unified  analysis,  with  appropriate  constrains  on  travel-time  corrections,  emulates  the 
two-step  HDC-RCA  procedure. 

The  GMEL  solutions  in  Fig.  1  were  obtained  with  AK135  as  the  reference  velocity  model 
for  travel-time  calculation.  Fig.  2  shows  the  corresponding  results  when  the  reference  model  is 
a  hybrid  constructed  with  the  crust  and  uppermost  mantle  velocities  replaced  with  the  local  1- 
D  velocity  model  used  by  Bondar  et  al.  (2008)  in  their  RCA  analysis.  We  see  in  the  left-hand 
frame  that  the  hypocentroid  discrepancy  between  GMEL  and  HDC-RCA  has  doubled  (to  26  km) 
compared  to  Fig.  1,  reflecting  the  fact  that  the  hybrid  model  is  worse  than  AK135  for  teleseismic 
travel-time  prediction.  Th  right-hand  frame  of  Fig.  2,  however,  shows  better  agreement  between 
the  HDC-RCA  solution  and  GMEL  solution  that  includes  data  from  local  stations,  with  the  median 
epicentral  difference  now  2.6  km.  This  indicates  that  the  local  model  used  by  Bondar  et  al.  (2008) 
is  better  than  AK135  for  predicting  local  travel  times. 

4.2  Tests  With  Pahute  Mesa  Explosions 

We  obtained  event  information  and  arrival-time  data  for  57  Pahute  Mesa  explosions  from  the 
IASPEI  REL  database.  The  data  include  arrivals  at  local,  regional  and  teleseismic  distances.  Our 
tests  used  local  arrival  data  at  only  three  stations  in  order  to  emulate  the  situation  for  which  the 
HDC-RCA  method  was  developed:  when  local  data  by  themselves  are  insufficient  to  find  accurate 
absolute  locations  for  any  of  the  events.  After  excluding  most  of  the  local  data,  our  data  set  for  the 
Pahute  cluster  comprised  1 1,078  arrivals  at  1,205  stations.  Three  quarters  of  the  arrival  times  were 
from  teleseismic  stations  and  only  52  were  from  local  stations  (see  Table  1).  The  GMEL  solutions 
were  obtained  using  AK135  as  the  reference  velocity  model  and  with  event  depths  fixed  to  their 
ground-truth  values 

4.2.1  Basic  Tests 

Our  first  test  applied  single-event  location  to  the  Pahute  Mesa  cluster.  Single-event  location  occurs 
when  all  the  travel-time  corrections  are  fixed  to  zero,  which  decouples  the  location  problem  across 
events.  The  GMEL  solution  for  the  event  epicenters  is  shown  as  the  blue  dots  in  Fig.  3,  which  also 
shows  the  mislocations  from  the  GT0  epicenters  (red  circles).  The  comparison  is  displayed  in  two 
ways.  The  left-hand  panel  compares  the  absolute  GMEL  epicenters  to  their  GT0  counterparts.  The 
panel  on  the  right  compares  a  shifted  version  of  the  GMEL  epicenters  to  the  GT0  epicenters,  where 
the  shift  was  calculated  so  that  the  centroid  of  the  GMEL  epicenters  matched  the  GT0  centroid. 
The  resulting  mislocations  thus  indicate  the  errors  in  the  relative  event  locations,  or  cluster  vectors. 
The  median  epicentral  mislocation  for  the  57  events  is  4.3  km,  with  a  3.4  km  shift  of  the  cluster 
centroid  to  the  south  accounting  for  much  of  the  mislocation. 
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Table  2:  Mislocation  statistics  for  the  epicenters  of  57  Pahute  Mesa  explosions,  displayed  in  Fig.  6. 


Variance 

Distrib. 

Median 

Largest 

Centroid 

treatment 

type 

misloc.  (km) 

misloc.  (km) 

misloc.  (km) 

fixed 

Gaussian 

2.4 

8.8 

1.9 

fixed 

non-Gaussian 

0.9 

3.9 

0.6 

adjustable 

Gaussian 

2.1 

4.4 

1.9 

adjustable 

non-Gaussian 

2.1 

4.1 

2.0 

Fig.  4  compares  a  multiple-event  solution  obtained  with  GMEL  to  the  GTO  epicenters.  As  in 
the  Rogun  example  above,  the  travel-time  corrections  for  the  52  local  (Pg)  paths  were  fixed  to  zero, 
while  teleseismic  and  regional  corrections  were  assigned  20-s  standard  deviation,  thus  emulating 
the  HDC-RCA  method.  We  see  that,  by  allowing  weakly  constrained  travel-time  corrections  for 
regional/teleseismic  paths,  both  absolute  and  relative  epicenter  mislocations  are  reduced.  The 
median  mislocation  in  this  case  is  2.4  km. 

Fig.  5  displays  the  single-event  location  and  multiple-event  location  solutions  shown  in  Figs.  3 
and  4  (respectively)  as  polar  plots  of  mislocation  vectors.  The  origin  of  each  plot  represents  the 
GTO  epicenters  of  the  explosions  (superposed)  while  each  circle  is  the  GMEL  epicenter  relative 
to  the  GTO  location,  which  thus  shows  its  absolute  mislocation.  Displayed  in  this  fashion,  it  is 
very  clear  that  the  multiple-event  location  solution  (right)  is  significantly  better  than  the  single¬ 
event  location  solution.  The  better  relative  locations  are  evident  by  the  more  tightly  clustered,  less 
scattered,  mislocation  vectors. 

4.2.2  Outlier  Suppression 

The  GMEL  multiple-event  location  solution  shown  in  the  right-hand  frame  of  Fig.  5  displays  some 
large  mislocations  (for  two  events  in  particular)  which  are  likely  due  to  outliers  in  the  arrival-time 
data  set.  We  thus  explored  the  use  of  two  techniques  for  mitigating  the  effects  of  outliers.  One 
is  to  replace  the  assumed  Gaussian  distribution  for  pick  errors  with  a  longer  tailed  non-Gaussian 
distribution,  which  we  took  to  be  a  generalized  Gaussian  distribution  with  p  =  1.25  (compared 
to  p  =  2  for  Gaussian).  The  second  device  for  outlier  suppression  we  considered  was  data  re¬ 
weighting,  whereby  pick-error  standard  errors  were  adjusted  as  part  of  the  GMEL  solution  using 
the  variance  estimation  technique  described  in  Section  3.4.1. 

Fig.  6  compares  GMEL  solutions  obtained  with  and  without  the  use  of  each  outlier  suppression 
technique.  (The  top  left-hand  panel  thus  repeats  the  right  panel  of  Fig.  5.)  Table  2  reports  the 
median  and  maximum  epicentral  mislocation  for  each  case,  as  well  as  the  mislocation  of  the  cluster 
centroid  for  each.  We  see  from  the  table  and  mislocation  plots  in  Fig.  6  that  both  outlier  suppression 
techniques  reduce  epicentral  mislocations.  The  smallest  mislocations  and  smallest  centroid  shift 
results  with  the  long-tailed,  non-Gaussian  pick-error  distribution  with  fixed  variances 
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Table  3:  Geostatistical  parameters  of  velocity  heterogeneity  (relative  to  AK135)  used  for  calcula¬ 
tion  of  prior  covariance  matrix  of  travel-time  corrections. 


Velocity 

Correl.  length  (km) 

Std.  dev. 

Horiz. 

Vert. 

Crust 

10% 

300 

17.5 

Moho  -  210  km 

2% 

300 

60 

21 0-41 0  km 

1  % 

300 

60 

41 0-660  km 

1  % 

300 

60 

>  660  km 

0.5% 

300 

120 

Table  4:  Mislocation  statistics  for  GMEL  results  in  Fig.  8,  obtained  with  a  full  covariance  matrix 
for  local,  regional  and  teleseismic  travel-time  corrections. 


Variance 

Distrib. 

Median 

Largest 

Centroid 

treatment 

type 

misloc.  (km) 

misloc.  (km) 

misloc.  (km) 

fixed 

Gaussian 

3.0 

14.1 

2.5 

fixed 

non-Gaussian 

3.0 

4.7 

2.8 

adjustable 

Gaussian 

2.6 

4.7 

2.3 

adjustable 

non-Gaussian 

2.2 

3.8 

2.0 

4.2.3  Full  Covariance  Matrix 

All  the  GMEL  solutions  presented  thus  far  followed  the  HDC-RCA  prescription  for  travel-time 
corrections,  with  large  variances  assigned  to  regional  and  teleseismic  paths  (similar  to  the  HDC 
method)  and  zero  corrections  assigned  to  local  paths  (as  in  RCA).  Further,  the  corrections  were 
assumed  to  be  uncorrelated  with  one  another,  implying  the  variance-covariance  matrix  (rpn  in 
Section  3)  is  diagonal.  The  next  results  we  show  used  a  full  variance-covariance  matrix  for  travel¬ 
time  corrections,  generated  from  a  geostatistical  model  of  the  velocity  difference  between  the  real 
Earth  and  the  AK135  reference  model.  We  considered  only  one  set  of  geostatistical  parameters, 
listed  in  Table  3.  The  parameters  are  similar  to  those  inferred  by  Rodi  and  Myers  (2013)  from 
travel-time  observations,  although  our  velocity  variances  below  210  km  depth  are  somewhat  larger. 
Using  these  parameters  we  calculated  the  variance-covariance  matrix  for  the  paths  from  an  event 
location  near  the  center  of  the  Pahute  Mesa  explosion  cluster  to  the  stations  in  the  Pahute  Mesa 
data  set.  The  travel-time  standard  deviations  derived  from  the  diagonal  elements  of  the  resulting 
1205  x  1205  matrix  are  plotted  as  a  function  of  event-station  distance  (out  to  50°)  in  Fig.  7. 

Fig.  8  shows  the  GMEL  epicenter  mislocations  resulting  with  the  use  of  the  full  variance- 
covariance  matrix  for  travel-time  corrections.  As  above,  there  are  four  cases  corresponding  to 
different  treatments  of  pick  errors.  Mislocation  statistics  for  the  four  cases  are  listed  in  Table  4. 
It  is  clear  that  the  mislocations  are  generally  larger  than  the  previous  cases  which  used  the  HDC- 
RCA  assumptions  (Fig.  6  and  Table  2).  The  smallest  mislocations  now  result  from  the  use  of 
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Table  5:  Mislocation  statistics  for  GMEL  results  in  Fig.  9,  obtained  with  zero  variances  for  local 
travel-time  corrections  and  a  full  covariance  matrix  for  teleseismic  and  regional  corrections. 


Variance 

Distrib. 

Median 

Largest 

Centroid 

treatment 

type 

misloc.  (km) 

misloc.  (km) 

misloc.  (km) 

fixed 

Gaussian 

1.6 

13.9 

1.1 

fixed 

non-Gaussian 

1.1 

5.8 

0.4 

adjustable 

Gaussian 

1.7 

3.9 

1.4 

adjustable 

non-Gaussian 

1.2 

3.1 

0.7 

both  outlier  suppression  techniques  (non-Gaussian  pick  errors  with  adjustable  variances),  although 
unlike  before  it  appears  that  the  use  of  adjustable  pick-error  variances  is  the  more  effective  of  the 
two  techniques. 

Pending  further  analysis,  we  attribute  the  larger  mislocations  resulting  with  our  travel-time 
covariance  modeling  to  the  somewhat  large  standard  deviations  that  were  calculated  for  local  cor¬ 
rections:  0.8,  0.9  and  3.1  seconds  at  stations  BGB,  GLR  and  TNP,  respectively.  The  arrival-time 
data  at  these  stations  are  thus  down-weighted  relative  to  teleseismic  and  regional  data  with  an 
accompanying  loss  of  information  about  the  absolute  event  locations. 

With  this  explanation  in  mind,  we  reran  the  cases  in  Fig.  8  using  a  modified  version  of  the  full 
covariance  matrix  with  local-station  variances  (and  associated  covariances)  set  to  zero.  The  results 
are  shown  in  Fig.  9  and  summarized  in  Table  5.  The  mislocations  are  now  comparable  to  those 
obtained  with  the  diagonal  covariance  matrix  shown  earlier,  which  also  used  zero  local 
corrections.  The  best  cases  seen  in  Table  5  are  those  using  a  non-Gaussian  distribution  for  pick 
errors,  which  rival  the  best  case  resulting  with  the  HDC-RCA  assumptions.  The  use  of  both 
techniques  for  outlier  suppression  now  yields  a  maximum  mislocation  of  3.1  km,  the  smallest 
value  of  this  metric  among  all  12  GMEF  solutions  we  have  presented. 

4.3  Database  Compilation 

The  multiple-event  location  methodology  we  are  developing  is  expected  to  have  less  restrictive 
criteria  on  allowable  event  and  station  geometries  than  previous  methods.  Therefore,  we  are  as¬ 
sembling  a  database  of  event  clusters  from  the  International  Seismological  Centre’s  Reviewed 
Event  Bulletin  (ISC  REB),  which  is  more  complete  than  the  IASPEI  REL.  We  began  this  task  by 
downloading  the  current  ISC  REB,  which  spans  the  time  frame  between  January  1963  and  Febru¬ 
ary  2012.  There  are  readings  prior  to  1960,  but  the  events  are  sparsely  recorded  and  not  of  much 
use  for  our  purposes.  The  bulletin  is  stored  as  year-long,  downloadable  files  in  the  IASPEI  Seismic 
Format  (ISF)  from  2004  to  2012.  Prior  to  2004,  the  bulletins  must  be  retrieved  through  web  service 
calls,  using  a  script  that  downloads  one  day  at  a  time.  Dr.  Reiter  wrote  the  scripts  to  retrieve  the 
bulletins  and  then  adapted  a  program  from  Dr.  Rodi  to  convert  the  results  from  HTML  browser 
format  to  the  ASCII  flat  files  that  are  read  by  the  GMEL  software. 

The  number  of  events  and  station  readings  in  the  ISC  bulletin  goes  up  dramatically  starting 
in  the  1990s.  For  example,  for  years  1963-1989  there  are  197,939  events  with  12,354,668  phase 
readings.  In  the  years  1990-1999  the  bulletin  contains  166,394  events  with  10,058,285  phase 
readings;  the  years  2000-2009  contain  360,457  events  (35,254,318  phases);  and  the  26  months  in 
the  bulletin  from  2010  to  2012  have  94,666  events  (11,138,496  readings).  Put  more  simply,  the 
ISC  bulletin  from  the  most  recent  26  months  contains  nearly  half  of  the  events  and  almost  the 
same  number  of  phases  as  the  bulletin  from  the  first  27  years.  The  next  step  of  database 
compilation  is  to  compile  expanded  data  sets  of  event  clusters  and  their  arrival  picks  for  events 
which  appear  in  the  IASPEI  REL,  and  to  identify  new  clusters  in  areas  of  monitoring  interest. 
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Figure  1 :  Two  GMEL  solutions  for  the  Rogun  earthquake  cluster  are  compared  to  the  HDC-RCA 
solution  reported  in  the  IASPEI REL.  Left:  Multiple-event  location  was  applied  to  only  teleseismic 
and  regional  data.  Right:  Local  arrival  times  were  added  with  their  travel-time  corrections  fixed 
to  zero,  which  emulates  the  HDC-RCA  method.  In  each  plot,  a  line  connects  the  GMEL  epicenter 
for  each  event  with  the  corresponding  HDC-RCA  epicenter. 
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Figure  2:  Same  as  Fig.  1  except  that  a  hybrid  velocity  model  incorporating  the  crustal  model 
used  by  Bondar  et  al.  (2008)  was  used  (instead  of  AK135)  as  the  reference  model  for  travel-tine 
calculation. 
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Absolute  locations  Cluster  vectors 


Figure  3:  GMEL  solution  for  the  epicenters  of  57  Pahute  Mesa  explosions  obtained  with  single¬ 
event  location.  The  GMEL  epicenters  are  compared  to  the  GTO  epicenters  in  two  ways.  Left:  the 
absolute  GMEL  epicenters  and  mislocation  vectors  are  plotted.  Right:  the  GMEL  epicenters  are 
shifted  to  match  the  centroid  of  the  GTO  epicenters,  this  displaying  relative  mislocations,  i.e.  errors 
in  the  cluster  vectors. 
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Figure  4:  GMEL  solution  for  the  Pahute  Mesa  explosion  epicenters  obtained  with  multiple-event 
location.  The  arrival-time  data  set  includes  52  Pg  phases  from  three  local  station,  whose  travel¬ 
time  corrections  were  fixed  to  zero  to  emulate  the  RCA  procedure.  The  solution  is  displayed  in  the 
same  format  as  Fig.  3. 
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Single-event  location  Multiple-event  location 


Figure  5:  The  GMEL  solutions  for  the  Pahute  Mesa  explosions  in  Figs.  3  (single-event  location) 
and  4  (multiple-event  location)  are  shown  as  polar  plots  of  mislocation  vectors.  The  origin  of 
each  plot  represents  the  GTO  epicenters,  while  the  dots  are  the  mislocation  vectors  for  the  GMEL 
solution. 
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Figure  6:  Mislocation  vectors  of  GMEL  solutions  for  the  epicenters  of  57  Pahute  Mesa  explosions 
for  four  different  treatments  of  pick  errors.  Top  left:  pick  errors  are  Gaussian  (p  =  2)  with  pre¬ 
assigned  (fixed)  variances  (repeats  Fig.  5  right).  Top  right:  non-Gaussian  (p=  1.25)  pick  errors 
with  fixed  variances.  Bottom  left:  Gaussian  with  adjustable  variances,  solved  for  as  part  of  the 
GMEL  solution.  Bottom  right:  non-Gaussian  with  adjustable  variances. 
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Figure  7:  The  first-arrival  P-wave  travel-time  correction  standard  deviations  for  the  paths  in  the 
Pahute  Mesa  data  set  are  plotted  as  a  function  of  event-station  distance,  as  computed  from  the 
geostatistical  parameters  of  velocity  listed  in  Table  3. 
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Figure  8:  Mislocation  vectors  of  GMEL  solutions  for  the  epicenters  of  57  Pahute  Mesa  explosions, 
obtained  with  a  variance-covariance  matrix  for  travel-time  corrections  derived  from  geostatistical 
parameters  for  velocity  heterogeneity.  The  results  are  shown  in  the  same  format  and  for  the  same 
four  cases  as  Fig.  6. 
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Figure  9:  Same  as  Fig.  8  except  the  variance-covariance  matrix  for  travel-time  corrections  was 
modified  by  setting  the  variances  for  local  stations  to  zero,  as  was  done  for  the  diagonal  variance 
matrix  cases  in  Fig.  6. 
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5  CONCLUSIONS 


The  first  year  of  the  project  was  devoted  largely  to  the  development  cf  our  new  multiple-event 
location  methodology,  including  the  mathematical  formulation  of  the  approach  and  implementation 
of  key  elements  of  the  approach  in  the  GMEL  locator.  The  basic  elements  have  been  implemented 
and  tested,  resulting  in  numerical  algorithms  based  on  grid-search  and  nonlinear  conjugate 
gradients  to  solve  the  multiple-event  location  problem  under  general  stochastic  assumptions  about 
travel-time  corrections  and  pick  errors.  Some  important  aspects  of  our  approach  await 
implementation,  includ-ing  uncertainty  analysis  within  our  more  general  framework  and  the 
extension  to  event-dependent  travel-time  corrections. 

We  tested  our  current  version  of  GMEL  with  data  from  two  event  clusters  in  the  IASPEI  refer¬ 
ence  event  list  (REL).  Application  to  the  Rogun  earthquake  cluster  in  Tajikistan  corroborated  our 
theoretical  conclusion  that  our  general  approach  incorporates  the  HDC-RCA  method  as  a  special 
case.  Application  to  a  cluster  of  57  explosions  in  the  Pahute  Mesa  testing  area  of  Nevada  Test 
Site,  for  which  GTO  location  information  is  available,  showed  that  our  method  can  achieve  epicen- 
tral  locations  with  errors  of  3  km  and  less  when  outlier  suppression  techniques  we  implemented 
are  invoked,  consisting  of  a  non-Gaussian  stochastic  model  for  pick  errors  and  automatic  adjust¬ 
ment  of  pick-error  variances.  However,  tests  of  our  more  general  stochastic  model  of  travel-time 
corrections,  which  employs  a  full  variance-covariance  matrix  derived  from  a  geostatistical  charac¬ 
terization  of  unknown  velocity  anomalies  in  the  Earth,  did  not  reduce  mislocation  of  the  explosions 
at  Pahute  Mesa,  indicating  the  need  for  further  work  on  this  element  of  our  approach.  We  initiated 
efforts  on  the  compilation  of  data  sets  for  more  clusters  using  the  more  complete  International 
Seismological  Centre’s  reviewed  event  bulletin,  which  will  allow  more  exhaustive  testing  of  our 
multiple-event  location  methodology  in  the  second  year  of  the  project. 
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